Fast Concurrent Reinforcement Learners
Abstract
When several agents learn concurrently, the payoff each agent receives depends on the behavior of the others. As the other agents learn, each agent's reward becomes non-stationary, which makes learning in multiagent systems harder than single-agent learning. A few methods, however, are known to guarantee convergence to equilibrium in the limit in such systems. In this paper we experimentally study one such technique, minimax-Q, in a competitive domain and prove its equivalence with another well-known method for competitive domains. We study the rate of convergence of minimax-Q and investigate ways to increase it. We also present a variant of the algorithm, minimax-SARSA, and prove its convergence to minimax-Q values under appropriate conditions. Finally, we show that this new algorithm outperforms simple minimax-Q in a general-sum domain as well.
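The abstract names the two update rules without stating them, so the following is only a minimal sketch of how they are commonly written, not the authors' implementation. It assumes a tabular Q indexed by state, own action, and opponent action, restricts each state to a 2x2 zero-sum game so the minimax value has a closed form, and assumes that minimax-SARSA replaces the minimax value of the next state with the Q-value of the joint action actually taken there, by analogy with single-agent SARSA. All function and parameter names are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]  # rows: our actions, columns: opponent actions


def game_value_2x2(q: Matrix) -> float:
    """Value of a 2x2 zero-sum matrix game for the row (maximizing) player."""
    (a, b), (c, d) = q
    maximin = max(min(a, b), min(c, d))  # best pure guarantee for the row player
    minimax = min(max(a, c), max(b, d))  # best pure guarantee for the column player
    if maximin == minimax:
        return maximin  # saddle point: pure strategies are optimal
    # No saddle point: both players mix; standard closed form for 2x2 games.
    return (a * d - b * c) / (a + d - b - c)


def minimax_q_update(Q: Dict[int, Matrix], s: int, a: int, o: int, r: float,
                     s_next: int, alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Off-policy target: reward plus discounted minimax value of the next state."""
    target = r + gamma * game_value_2x2(Q[s_next])
    Q[s][a][o] = (1 - alpha) * Q[s][a][o] + alpha * target


def minimax_sarsa_update(Q: Dict[int, Matrix], s: int, a: int, o: int, r: float,
                         s_next: int, a_next: int, o_next: int,
                         alpha: float = 0.1, gamma: float = 0.9) -> None:
    """On-policy target (assumed form): uses the joint action actually taken next."""
    target = r + gamma * Q[s_next][a_next][o_next]
    Q[s][a][o] = (1 - alpha) * Q[s][a][o] + alpha * target


# Toy table: state 0 is unexplored, state 1 looks like matching pennies.
Q = {0: [[0.0, 0.0], [0.0, 0.0]], 1: [[1.0, -1.0], [-1.0, 1.0]]}
minimax_q_update(Q, s=0, a=0, o=0, r=1.0, s_next=1, alpha=0.5)
minimax_sarsa_update(Q, s=0, a=0, o=1, r=1.0, s_next=1, a_next=0, o_next=0, alpha=0.5)
```

In this sketch the only difference between the two rules is the bootstrap target. The minimax-Q target needs the game value of the next state, which in general requires solving a linear program, while the assumed minimax-SARSA target reads a single table entry.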
Similar resources
Concurrent reinforcement learning as a rehearsal for decentralized planning under uncertainty
Decentralized partially-observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Reinforcement learning (RL) based approaches have been recently proposed for distributed solution of Dec-POMDPs ...
Concurrent Bayesian Learners for Multi-Robot Patrolling Missions
Distributed robot systems have been adopted lately for security purposes, such as in automatic multirobot patrolling of infra-structures. Research has shown that deterministic patrol routes can lead to effective performance. However, they can potentially be predicted by intelligent intruders. This work presents a probabilistic multi-robot patrolling strategy, where each autonomous agent uses Ba...
Increasing PageRank through Reinforcement Learning
This paper describes a reinforcement learning method, derived from collective intelligence principles, for increasing the combined PageRank for a set of domains. This increased rank is achieved through a set of cooperating reinforcement learners that learn, through exploration, how to add links within the set of domains. We show how reinforcement learners using traditional reward functions perf...
Multi-Agent Systems of Inverse Reinforcement Learners in Complex Games
Reinforcement Learning (RL) allows an agent to discover a suitable policy to achieve a goal. However, interesting problems for RL become complex extremely fast, as a function of the number of features that compose the state space. The proposed research is to decompose a core problem into tasks with only the features required to solve the task. The core agent then uses the reward for the task, w...
Distributed and accumulated reinforcement arrangements: evaluations of efficacy and preference.
We assessed the efficacy of, and preference for, accumulated access to reinforcers, which allows uninterrupted engagement with the reinforcers but imposes an inherent delay required to first complete the task. Experiment 1 compared rates of task completion in 4 individuals who had been diagnosed with intellectual disabilities when reinforcement was distributed (i.e., 30-s access to the reinforc...